Instance selection algorithm for big data based on random forest and voting mechanism
ZHOU Xiang, ZHAI Junhai, HUANG Yajie, SHEN Ruicai, HOU Yingzhen
Journal of Computer Applications    2021, 41 (1): 74-80.   DOI: 10.11772/j.issn.1001-9081.2020060982
To address instance selection for big data, an algorithm based on Random Forest (RF) and a voting mechanism was proposed. Firstly, a big dataset was divided into two subsets: a large first subset and a small or medium-sized second subset. Then, the large subset was divided into q smaller subsets, which were deployed to q cloud computing nodes, and the small or medium-sized subset was broadcast to all q nodes. Next, a random forest was trained on the local data subset at each node and used to select instances from the second subset. The instances selected at the different nodes were merged to obtain the selected-instance subset for that round. This process was repeated p times to obtain p subsets of selected instances. Finally, these p subsets were used for voting to obtain the final selected instance set. The proposed algorithm was implemented on two big data platforms, Hadoop and Spark, and the implementation mechanisms of the two platforms were compared. In addition, the proposed algorithm was compared with the Condensed Nearest Neighbor (CNN) and Reduced Nearest Neighbor (RNN) algorithms on 6 large datasets. Experimental results show that, compared with these two algorithms, the proposed algorithm achieves higher test accuracy and lower time consumption when the dataset is larger. These results indicate that the proposed algorithm has good generalization ability and high operational efficiency in big data processing, and can effectively solve the problem of big data instance selection.
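The final voting step over the p rounds can be sketched as follows. The abstract does not state the voting rule, so a strict-majority threshold is assumed here, and the function name `vote_select` and the integer instance ids are hypothetical.

```python
from collections import Counter

def vote_select(rounds, min_votes=None):
    """Keep the instances chosen in at least `min_votes` of the p rounds.

    `rounds` is a list of p collections of instance ids, one per round.
    ASSUMPTION: a strict majority is used when `min_votes` is not given.
    """
    p = len(rounds)
    if min_votes is None:
        min_votes = p // 2 + 1
    # Each round contributes at most one vote per instance.
    votes = Counter(i for selected in rounds for i in set(selected))
    return sorted(i for i, v in votes.items() if v >= min_votes)

# Three rounds (p = 3) over made-up instance ids.
rounds = [{1, 2, 3, 5}, {2, 3, 4}, {2, 5, 6}]
final_set = vote_select(rounds)
```

Raising `min_votes` towards p keeps only instances that every round agrees on, which trades a smaller selected set against the risk of discarding useful instances.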
Mass and calcification classification method in mammogram based on multi-view transfer learning
XIAO He, LIU Zhiqin, WANG Qingfeng, HUANG Jun, ZHOU Ying, LIU Qiyu, XU Weiyun
Journal of Computer Applications    2020, 40 (5): 1460-1464.   DOI: 10.11772/j.issn.1001-9081.2019101744

To address the shortage of available training data in the classification of breast masses and calcifications, a multi-view model based on secondary transfer learning was proposed, taking the imaging characteristics of mammograms into account. Firstly, CBIS-DDSM (Curated Breast Imaging Subset of Digital Database for Screening Mammography) was used to construct a dataset of local breast tissue sections for pre-training the backbone network, which completed the domain adaptation learning of the backbone network and gave it the essential ability to capture pathological features. Then, the backbone network was transferred a second time to the multi-view model and fine-tuned on the dataset of Mianyang Central Hospital. At the same time, CBIS-DDSM was used to increase the number of positive samples in training to improve the generalization ability of the network. The experimental results show that the domain adaptation learning and data augmentation strategy improve the performance criteria by 17% on average and achieve AUC (Area Under Curve) values of 94% and 90% for mass and calcification respectively.
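For readers unfamiliar with the reported metric, AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below uses that pairwise definition; the labels and scores are made up and are not data from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative labels (1 = lesion present) and classifier scores.
example_auc = auc([1, 1, 0, 0, 1], [0.9, 0.4, 0.5, 0.2, 0.8])
```

The quadratic loop is fine for a small test set; for large sets a rank-based formulation gives the same value more cheaply.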

Adaptive hierarchical searchable encryption scheme based on learning with errors
ZHANG En, HOU Yingying, LI Gongli, LI Huimin, LI Yu
Journal of Computer Applications    2020, 40 (1): 148-156.   DOI: 10.11772/j.issn.1001-9081.2019060961
To solve the problems that existing hierarchical searchable encryption schemes cannot effectively resist quantum attacks and cannot flexibly add or delete levels, an Adaptive Hierarchical Searchable Encryption scheme based on learning with errors (AHSE) was proposed. Firstly, the scheme was made resistant to quantum attacks by exploiting the multidimensional characteristic of lattices and basing its security on the Learning With Errors (LWE) problem on lattices. Secondly, a condition key was constructed to divide users clearly into different levels, so that a user can only search the files at his or her own level, achieving effective level-based access control. At the same time, a segmented index structure with good adaptability was designed, whose levels can be added and deleted flexibly, meeting the requirements of access control at different granularities. Moreover, all users in the scheme can search by sharing a single segmented index table, which effectively improves search efficiency. Finally, theoretical analysis shows that updating, deleting and changing the level of users and files in the scheme is simple, which makes the scheme suitable for dynamic encrypted databases, cloud medical systems and other dynamic environments.
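The LWE problem that the scheme relies on can be illustrated with a toy single-bit encryption in the style of Regev's construction. This is not the AHSE scheme and is far too small to be secure; the parameters `n`, `m`, `q` and all function names are assumptions chosen only so that decryption is always correct.

```python
import random

def keygen(n=8, m=16, q=257, rng=random):
    """Secret s and public LWE samples (A, b = A*s + e mod q), small errors e."""
    s = [rng.randrange(q) for _ in range(n)]
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    e = [rng.choice((-1, 0, 1)) for _ in range(m)]
    b = [(sum(a * x for a, x in zip(row, s)) + err) % q
         for row, err in zip(A, e)]
    return s, (A, b, q)

def encrypt(pub, bit, rng=random):
    """Sum a random subset of the samples and add bit * floor(q/2)."""
    A, b, q = pub
    rows = [i for i in range(len(A)) if rng.random() < 0.5]
    u = [sum(A[i][j] for i in rows) % q for j in range(len(A[0]))]
    v = (sum(b[i] for i in rows) + bit * (q // 2)) % q
    return u, v

def decrypt(s, q, ciphertext):
    """v - <u, s> equals the summed error plus bit * floor(q/2) (mod q)."""
    u, v = ciphertext
    d = (v - sum(a * x for a, x in zip(u, s))) % q
    return 1 if q // 4 < d < 3 * q // 4 else 0

rng = random.Random(0)
secret, public = keygen(rng=rng)
recovered = [decrypt(secret, public[2], encrypt(public, bit, rng=rng))
             for bit in (0, 1, 1, 0)]
```

With these toy parameters the summed error is at most 16 in absolute value, well below q/4, so the decoded bit is always the encrypted one. Recovering `s` from `(A, b)` without knowing the errors is the LWE problem.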
Pneumothorax detection and localization in X-ray images based on dense convolutional network
LUO Guoting, LIU Zhiqin, ZHOU Ying, WANG Qingfeng, CHENG Jiezhi, LIU Qiyu
Journal of Computer Applications    2019, 39 (12): 3541-3547.   DOI: 10.11772/j.issn.1001-9081.2019050884
There are two main problems in pneumothorax detection in X-ray images. First, the pneumothorax usually overlaps with tissues such as ribs and clavicles, which easily causes missed diagnoses, and the performance of existing pneumothorax detection methods remains to be improved. Second, convolutional neural network-based algorithms cannot indicate the suspicious pneumothorax area and therefore lack interpretability. To address these problems, a novel method combining Dense convolutional Network (DenseNet) and gradient-weighted class activation mapping was proposed. Firstly, a large-scale chest X-ray dataset named PX-ray was constructed for model training and testing. Secondly, the output node of the DenseNet was modified and a sigmoid function was added after the fully connected layer to classify the chest X-ray images. In the training process, the weight of the cross entropy loss function was set to alleviate the problem of data imbalance and improve the accuracy of the model. Finally, the parameters of the last convolutional layer of the network and the corresponding gradients were extracted, and the pneumothorax areas were roughly located by gradient-weighted class activation mapping. The experimental results show that the proposed method has a detection accuracy of 95.45%, with Area Under Curve (AUC), sensitivity and specificity all higher than 0.9, outperforms the classic algorithms VGG19, GoogLeNet and ResNet, and realizes visualization of the pneumothorax area.
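The idea of weighting the cross entropy loss to counter class imbalance can be sketched as below. The abstract does not give the weight used, so the ratio of negative to positive samples is assumed here as a common heuristic, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def weighted_bce(logits, labels, pos_weight):
    """Mean binary cross entropy with positive samples scaled by pos_weight."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)  # avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# ASSUMED heuristic: weight positives by the negative/positive ratio.
labels = [1, 0, 0, 0]
pos_weight = labels.count(0) / labels.count(1)
loss = weighted_bce([-2.0, -1.5, -3.0, 0.5], labels, pos_weight)
```

With `pos_weight` above 1, a missed positive (a pneumothorax image scored low) costs more than a false alarm of the same confidence, which pushes the classifier away from always predicting the majority class.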
Identification method of depressive tendency with multiple feature fusion
ZHOU Ying, WANG Hong, REN Yanju, HU Xiaohong
Journal of Computer Applications    2019, 39 (1): 168-175.   DOI: 10.11772/j.issn.1001-9081.2018051180
In recent years, depressive tendencies have been appearing at younger ages and affecting more people. Although research on the topic has achieved some results, an objective and accurate method for identifying depressive tendencies is still lacking, as is research from multiple perspectives. Therefore, a method combining a mental health self-check table with eye tracking was proposed for identifying depressive tendencies and was studied from multiple perspectives. Innovative features covering eye movement, memory, cognitive style and network behavior were incorporated. To handle complex feature relationships and extract more useful information, a scanning process was combined with a stacking method to form a recognition model for depressive tendencies called the scanning stacking model. To evaluate the scanning stacking model comprehensively and objectively, the independent contributions of the scanning process and the stacking method were both evaluated in the experiment. The experimental results show that the independent contribution of the scanning process is 0.03 and that of the stacking method is 0.02. In addition, the scanning stacking model was compared with several models in terms of R-squared, Mean Square Error (MSE) and average absolute error, and the results show that it has a better prediction effect.
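The three comparison criteria named above have standard definitions, sketched here on made-up numbers. The function name is hypothetical and the values are not results from the paper.

```python
def regression_metrics(y_true, y_pred):
    """Return R-squared, mean square error and mean absolute error."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_true) ** 2 for t in y_true)
    if sst == 0:
        raise ValueError("R-squared is undefined for constant targets")
    return {
        "r2": 1.0 - sse / sst,   # share of variance explained
        "mse": sse / n,          # penalises large errors quadratically
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])
```

A higher R-squared and lower MSE and average absolute error indicate a better fit, which is the sense in which one model "has a better prediction effect" than another.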
Effect of Web advertisement based on multi-modal features under the influence of multiple factors
HU Xiaohong, WANG Hong, REN Yanju, ZHOU Ying
Journal of Computer Applications    2018, 38 (4): 987-994.   DOI: 10.11772/j.issn.1001-9081.2017102425
Although research on the effect of Web advertisements has achieved good results, thorough research on the interaction between an advertisement and each blue link in a Web page is still lacking, as is thorough analysis of the impact of user characteristics and advertising features, and the advertising metrics in use are inappropriate. Therefore, a method based on multi-modal feature fusion was proposed to study the effectiveness of Internet advertising and user behavior patterns under the influence of multiple factors. Through quantitative analysis of multi-modal features, the attractiveness of advertising was verified, and the attention effects under different conditions were summarized. By mining frequent patterns in user behavior information and taking the characteristics of the data into account, the Directional Frequent Browsing Patterns (DFBP) algorithm was proposed to directionally mine the most common fixed-length browsing patterns of users. Memory was used as a new index to measure the quality of advertising, the random forest algorithm was improved with frequent patterns, and a new advertising memory model was built by fusing multi-modal features. Experimental results show that the memory model has an accuracy of 91.64% and good robustness.
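Mining fixed-length, order-sensitive browsing patterns can be sketched as counting contiguous windows across sessions. This is only an illustration of the general idea, not the DFBP algorithm itself, whose details the abstract does not give. The region names, the `min_support` threshold and the function name are hypothetical.

```python
from collections import Counter

def frequent_patterns(sessions, length, min_support=2):
    """Count ordered windows of `length` regions; each session votes once."""
    counts = Counter()
    for session in sessions:
        windows = {tuple(session[i:i + length])
                   for i in range(len(session) - length + 1)}
        counts.update(windows)
    # Sort by support (descending), then by pattern for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(pattern, c) for pattern, c in ranked if c >= min_support]

# Each session is the ordered list of page regions a user looked at.
sessions = [
    ["ad", "link1", "link2"],
    ["ad", "link1", "link3"],
    ["link1", "ad", "link1"],
]
patterns = frequent_patterns(sessions, length=2)
```

Because windows are tuples, `("ad", "link1")` and `("link1", "ad")` are counted separately, which is what makes the mined patterns directional.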
Speckle suppression algorithm for ultrasound image based on Bayesian nonlocal means filtering
FANG Hongdao, ZHOU Yingyue, LIN Maosong
Journal of Computer Applications    2018, 38 (3): 848-853.   DOI: 10.11772/j.issn.1001-9081.2017071780
Ultrasound imaging is one of the most important diagnostic techniques of modern medical imaging. However, its development has been limited by the presence of multiplicative speckle noise. To address this problem, an improved strategy for the Bayesian Non-Local Means (NLM) filtering algorithm was proposed. Firstly, a Bayesian formulation was applied to derive an NLM filter adapted to a relevant ultrasound noise model, which leads to two methods of calculating the distance between image blocks: the Pearson distance and the root distance. Secondly, to lighten the computational burden, an image block pre-selection process was used to accelerate the selection of similar image blocks in the non-local area. In addition, the relationship between the filtering parameter and the noise variance was determined experimentally, which made the parameter adaptive to the noise. Finally, VS (Visual Studio) and OpenCV (Open Source Computer Vision Library) were used to implement the algorithm, greatly reducing the program running time. To evaluate the denoising performance of the proposed algorithm, experiments were conducted on both phantom images and real ultrasound images. The experimental results show that, compared with some existing classical algorithms, the algorithm greatly improves speckle noise removal and achieves satisfactory results in preserving edges and image details.
Detection of SQL injection behaviors for PHP applications
ZHOU Ying, FANG Yong, HUANG Cheng, LIU Liang
Journal of Computer Applications    2018, 38 (1): 201-206.   DOI: 10.11772/j.issn.1001-9081.2017071692
The SQL (Structured Query Language) injection attack is a threat to Web applications. To detect SQL injection behaviors in PHP (Hypertext Preprocessor) applications, a detection model based on tainting technology was proposed. Firstly, the SQL statement was obtained whenever an SQL function was executed, and the identity information of the attacker was recorded through PHP extension technology. Based on this information, the request log was generated and used as the analysis source. Secondly, an SQL parsing process with taint marking was implemented based on SQL grammar analysis and the abstract syntax tree. By using tainting technology, multiple features reflecting SQL injection behaviors were extracted. Finally, the random forest algorithm was used to identify malicious SQL requests. The experimental results indicate that the proposed model achieves an accuracy of 96.9%, which is 7.2 percentage points higher than that of regular expression matching detection. The information acquisition module of the proposed model can be loaded as an extension in any PHP application; therefore, it is portable and applicable to security audit and attack traceability.
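The taint-marking idea can be sketched in a simplified form: mark the part of the executed SQL statement that came from user input, tokenize the statement, and count which kinds of tokens fall inside the tainted span. The paper works on an abstract syntax tree inside a PHP extension, whereas this sketch uses a regular-expression tokenizer and locates the input by substring search, both of which are assumptions made for illustration. The keyword list and feature names are hypothetical.

```python
import re

# ASSUMED keyword list for illustration; a real SQL grammar covers far more.
KEYWORDS = {"SELECT", "UNION", "OR", "AND", "DROP", "INSERT", "UPDATE",
            "DELETE", "SLEEP"}

TOKEN = re.compile(r"""
    (?P<comment>--|\#|/\*)              # comment openers
  | (?P<string>'(?:[^'\\]|\\.)*')       # single-quoted string literal
  | (?P<number>\d+)
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<op>[=<>!]+|[(),;*])
""", re.VERBOSE)

def taint_features(sql, user_input):
    """Count keywords, comment openers and operators inside the tainted span."""
    counts = {"keywords": 0, "comments": 0, "operators": 0}
    start = sql.find(user_input) if user_input else -1
    if start < 0:
        return counts
    end = start + len(user_input)
    for m in TOKEN.finditer(sql):
        if m.start() < end and m.end() > start:  # token overlaps tainted span
            kind = m.lastgroup
            if kind == "word" and m.group().upper() in KEYWORDS:
                counts["keywords"] += 1
            elif kind == "comment":
                counts["comments"] += 1
            elif kind == "op":
                counts["operators"] += 1
    return counts

def build_query(name):
    # Deliberately unsafe concatenation, as in a vulnerable application.
    return "SELECT * FROM users WHERE name = '" + name + "'"

attack = "x' OR 1=1 -- "
attack_features = taint_features(build_query(attack), attack)
benign_features = taint_features(build_query("alice"), "alice")
```

Benign input stays inside one string literal, so no SQL structure is tainted. Injected input breaks out of the literal and taints keywords, operators and comment markers. Counts of this kind are the sort of features a classifier such as a random forest could then be trained on.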
Non-local means denoising algorithm based on image segmentation
XU Su, ZHOU Yingyue
Journal of Computer Applications    2017, 37 (7): 2078-2083.   DOI: 10.11772/j.issn.1001-9081.2017.07.2078
To address the non-adaptive filtering parameters and edge blurring of the Non-Local Means (NLM) algorithm, an improved NLM denoising algorithm based on image segmentation was proposed. The proposed algorithm is composed of two phases. In the first phase, the filtering parameter was determined according to the noise level and image structure, and the traditional NLM algorithm was used to remove the noise and generate a rough clean image. In the second phase, the estimated clean image was divided into a detail region and a background region based on pixel variance, and the image patches belonging to different regions were denoised separately. To remove the noise effectively, back projection was used to make full use of the residual structure in the method noise of the first phase. The experimental results show that, compared with traditional NLM and three improved NLM algorithms, the proposed algorithm achieves higher Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) while preserving more structural details and edges, so that the denoised image is clear and retains the real information.
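The traditional NLM filter used in the first phase can be sketched in plain Python as below. This is a minimal, unoptimised version with a fixed filtering parameter `h`. The paper instead chooses the parameter from the noise level and image structure, so the fixed value, the window sizes and the function name here are assumptions.

```python
import math

def nlm_denoise(img, patch=1, search=2, h=0.5):
    """Replace each pixel by a patch-similarity-weighted mean of its neighbours.

    img: 2D list of floats; patch/search: half-widths of the comparison patch
    and the search window; h: filtering parameter (larger = more smoothing).
    """
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Clamp coordinates so patches near the border stay inside the image.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def patch_dist(r1, c1, r2, c2):
        total = 0.0
        for dr in range(-patch, patch + 1):
            for dc in range(-patch, patch + 1):
                d = px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)
                total += d * d
        return total / (2 * patch + 1) ** 2

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            wsum = acc = 0.0
            for rr in range(max(r - search, 0), min(r + search, rows - 1) + 1):
                for cc in range(max(c - search, 0), min(c + search, cols - 1) + 1):
                    w = math.exp(-patch_dist(r, c, rr, cc) / (h * h))
                    wsum += w
                    acc += w * img[rr][cc]
            out[r][c] = acc / wsum  # wsum >= 1 because the self-weight is 1
    return out

# A small step edge with one noisy pixel.
noisy = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.3, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
denoised = nlm_denoise(noisy)
```

A single global `h` is what causes the trade-off the abstract describes: a large value blurs edges, a small one leaves noise in flat regions, which motivates treating detail and background regions separately.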
Image denoising algorithm based on sparse representation and nonlocal similarity
ZHAO Jingkun, ZHOU Yingyue, LIN Maosong
Journal of Computer Applications    2016, 36 (2): 551-555.   DOI: 10.11772/j.issn.1001-9081.2016.02.0551
To denoise images corrupted by mixed noise, such as Additive White Gaussian Noise (AWGN) combined with Salt-and-Pepper Impulse Noise (SPIN) and Random-Valued Impulse Noise (RVIN), an improved image restoration algorithm based on the existing weighted encoding method was proposed, integrating the image priors of sparse representation and non-local similarity. Firstly, dictionary-based sparse representation was used to build a variational denoising model, and a weighting factor was designed for the data fidelity term to suppress impulse noise. Secondly, the non-local means method was used to obtain an initial denoised image, and a mask matrix was then built to remove impulse noise points so as to obtain good non-local similarity prior knowledge. Finally, the image sparsity prior and the non-local similarity prior were integrated into the regularization of the variational model, and the final denoised image was obtained by solving the variational model. The experimental results show that, at different noise ratios, the Peak Signal-to-Noise Ratio (PSNR) of the proposed algorithm is 1.7 dB higher than that of the fuzzy weighted non-local means filter, and the Feature Similarity Index (FSIM) is 0.06 higher. Compared with the weighted encoding method, the PSNR is 0.64 dB higher and the FSIM is 0.03 higher. The proposed method has better recovery performance, especially for strongly textured images, and can retain the real information of the image.
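To put the reported decibel gains in context, PSNR follows from the mean squared error between a reference image and its estimate. The sketch below uses the standard definition with an assumed 8-bit peak value of 255. The tiny images are made up.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    n = 0
    sq_err = 0.0
    for ref_row, est_row in zip(reference, estimate):
        for a, b in zip(ref_row, est_row):
            sq_err += (a - b) ** 2
            n += 1
    mse = sq_err / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

clean = [[0, 0], [0, 0]]
corrupted = [[0, 0], [0, 255]]  # one saturated impulse out of four pixels
value = psnr(clean, corrupted)
```

Because the scale is logarithmic, a gain of 1.7 dB corresponds to dividing the mean squared error by 10^0.17, roughly 1.48.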
Traffic behavior feature based DoS & DDoS attack detection and abnormal flow identification for backbone networks
ZHOU Yingjie, JIAO Chengbo, CHEN Huinan, MA Li, HU Guangmin
Journal of Computer Applications    2013, 33 (10): 2838-2841.  
The existing methods for backbone networks only analyze coarse-grained network traffic characteristic parameters. Thus, they cannot ensure real-time detection of DoS (Denial of Service) and DDoS (Distributed Denial of Service) attacks while also identifying abnormal flows. To address this problem, a DoS & DDoS attack detection and abnormal flow identification method for backbone networks was proposed. First, coarse-grained network traffic characteristic parameters were analyzed to determine the time points at which abnormal behaviors occur; then, fine-grained traffic behavior characteristic parameters were analyzed at these time points to find the destination IP addresses that correspond to the abnormal behaviors; finally, comprehensive analysis was conducted on the extracted traffic that corresponds to the abnormal behaviors to determine DoS and DDoS attacks. The simulation results show that the proposed method can effectively detect DoS and DDoS attacks in backbone networks and accurately identify the abnormal traffic while ensuring real-time detection.
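The coarse-to-fine structure described above can be sketched as a two-stage filter: flag time bins whose total volume is unusually high, then look inside only those bins for destination addresses that dominate the traffic. The abstract does not specify the characteristic parameters or thresholds, so the packet-count statistic, the mean-plus-k-sigma rule, the `share` cut-off and the flow records below are all assumptions.

```python
from collections import Counter
from statistics import mean, pstdev

def detect(flows, k=2.0, share=0.5):
    """flows: iterable of (time_bin, dst_ip, packets).

    Stage 1 (coarse): flag bins whose total packets exceed mean + k * std.
    Stage 2 (fine): in flagged bins, report destinations receiving at least
    `share` of that bin's packets.
    """
    flows = list(flows)
    per_bin = Counter()
    for t, _, packets in flows:
        per_bin[t] += packets
    totals = list(per_bin.values())
    threshold = mean(totals) + k * pstdev(totals)

    alerts = {}
    for t, total in per_bin.items():
        if total <= threshold:
            continue  # the fine-grained pass is skipped for normal bins
        per_ip = Counter()
        for ft, ip, packets in flows:
            if ft == t:
                per_ip[ip] += packets
        alerts[t] = sorted(ip for ip, p in per_ip.items() if p / total >= share)
    return alerts

# Five ordinary bins split over two hosts, then a burst aimed at one host.
flows = []
for t, total in enumerate([100, 110, 90, 105, 95]):
    flows += [(t, "192.0.2.1", total // 2), (t, "192.0.2.2", total - total // 2)]
flows += [(5, "192.0.2.9", 950), (5, "192.0.2.1", 50)]
alerts = detect(flows)
```

Restricting the per-destination analysis to flagged bins is what keeps the fine-grained step affordable on backbone-scale traffic.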
Distributed data stream clustering algorithm based on affinity propagation
ZHANG Jianpeng, JIN Xin, CHEN Fucai, CHEN Hongchang, HOU Ying
Journal of Computer Applications    2013, 33 (09): 2477-2481.   DOI: 10.11772/j.issn.1001-9081.2013.09.2477
To address the low clustering quality and high communication cost of existing distributed clustering algorithms, a distributed data stream clustering algorithm (DAPDC) combining density with the idea of representative-point clustering was proposed. At the local sites, affinity propagation clustering was used to introduce the concept of class cluster representative points, which describe the local distribution of the data streams, while the global site obtained the global model by using an improved density clustering algorithm to merge the summary data structures uploaded from the local sites. The simulation results show that DAPDC can significantly improve the clustering quality of data streams in a distributed environment. At the same time, by using class cluster representative points, the algorithm can find clusters of different shapes and significantly reduce the amount of data transferred.
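The global merging step can be sketched as linking representative points that lie within a distance `eps` of each other, so that chains of nearby representatives form one cluster of arbitrary shape. The abstract does not describe the improved density clustering algorithm, so this plain density-connectivity merge, the `eps` value and the `(x, y, weight)` summary format are assumptions.

```python
import math

def merge_representatives(points, eps):
    """points: list of (x, y, weight) uploaded by local sites.

    Returns clusters as sorted lists of point indices; two representatives
    end up together if a chain of hops no longer than eps links them.
    """
    n = len(points)
    label = [None] * n
    clusters = []
    for i in range(n):
        if label[i] is not None:
            continue
        label[i] = len(clusters)
        stack, members = [i], []
        while stack:
            j = stack.pop()
            members.append(j)
            for k in range(n):
                if label[k] is None and \
                        math.dist(points[j][:2], points[k][:2]) <= eps:
                    label[k] = label[i]
                    stack.append(k)
        clusters.append(sorted(members))
    return clusters

# Representatives from several sites; weight = number of points summarised.
reps = [(0.0, 0.0, 10), (0.5, 0.0, 7), (5.0, 5.0, 12), (5.4, 5.2, 3),
        (1.0, 0.2, 4)]
clusters = merge_representatives(reps, eps=0.8)
cluster_weights = [sum(reps[i][2] for i in c) for c in clusters]
```

Only the five weighted representatives cross the network here rather than the 36 raw points they summarise, which is the source of the communication saving.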